Batched Bandit Problems
Abstract
Motivated by practical applications, chiefly clinical trials, we study the regret achievable for stochastic bandits under the constraint that the employed policy must split trials into a small number of batches. We propose a simple policy that operates under this constraint and show that a very small number of batches gives close to minimax optimal regret bounds. As a byproduct, we derive optimal policies with low switching cost for stochastic bandits.
Similar resources
MAGMA Batched: A Batched BLAS Approach for Small Matrix Factorizations and Applications on GPUs
A particularly challenging class of problems arising in many applications, called batched problems, involves linear algebra operations on many small-sized matrices. We proposed and designed batched BLAS (Basic Linear Algebra Subroutines), Level-2 GEMV and Level-3 GEMM, to solve them. We illustrate how to optimize batched GEMV and GEMM to assist batched advance factorization (e.g. bi-diagonaliza...
Optimizing the SVD Bidiagonalization Process for a Batch of Small Matrices
A challenging class of problems arising in many GPU applications, called batched problems, involves linear algebra operations on many small-sized matrices. We designed batched BLAS (Basic Linear Algebra Subroutines) routines, and in particular the Level-2 BLAS GEMV and the Level-3 BLAS GEMM routines, to solve them. We proposed device functions and big-tile settings in our batched BLAS design. W...
Design of a Hybrid Genetic Algorithm for Parallel Machines Scheduling to Minimize Job Tardiness and Machine Deteriorating Costs with Deteriorating Jobs in a Batched Delivery System
This paper studies the parallel machine scheduling problem subject to machine and job deterioration in a batched delivery system. By the machine deterioration effect, we mean that each machine deteriorates over time, at a different rate. Moreover, job processing times are increasing functions of their starting times and follow a simple linear deterioration. The objective functions are minimizin...
Batched bin packing
We introduce and study the batched bin packing problem (BBPP), a bin packing problem in which items become available for packing incrementally, one batch at a time. A batched algorithm must pack a batch before the next batch becomes known. A batch may contain several items; the special case when each batch consists of merely one item is the well-studied on-line bin packing problem. We obtain lo...
Policy Evaluation and Optimization with Continuous Treatments
We study the problem of policy evaluation and learning from batched contextual bandit data when treatments are continuous, going beyond previous work on discrete treatments. Previous work for discrete treatment/action spaces focuses on inverse probability weighting (IPW) and doubly robust (DR) methods that use a rejection sampling approach for evaluation and the equivalent weighted classificati...